24 research outputs found

    3D Reconstruction with Uncalibrated Cameras Using the Six-Line Conic Variety

    We present new algorithms for recovering Euclidean structure from a projective calibration of a set of cameras with square pixels but otherwise arbitrarily varying intrinsic and extrinsic parameters. Our results, based on a novel geometric approach, include a closed-form solution for the case of three cameras and two known vanishing points, and an efficient one-dimensional search algorithm for the case of four cameras and one known vanishing point. In addition, we present an algorithm for reliable automatic detection of vanishing points in the images. These techniques fit into a 3D reconstruction scheme oriented to the reconstruction of urban scenes. The satisfactory performance of the techniques is demonstrated with tests on synthetic and real data.

    A framework for the analysis and optimization of delay in multiview video coding schemes

    This thesis presents a novel framework for the analysis and optimization of the encoding and decoding delay in multiview video. The objective of this framework is to provide a systematic methodology for analyzing the delay in multiview encoders and decoders, together with useful tools for designing multiview encoders/decoders for applications with low-delay requirements. The proposed framework first characterizes the elements that influence the delay performance: i) the multiview prediction structure, ii) the hardware model of the encoder/decoder, and iii) the frame processing times. Second, it provides algorithms for computing the encoding/decoding delay of any arbitrary multiview prediction structure. The core of this framework is a methodology for analyzing the multiview encoding/decoding delay that is independent of the hardware architecture of the encoder/decoder, complemented by a set of models that particularize this delay analysis to the characteristics of that hardware architecture. Among these models, those based on graph theory are especially relevant because they decouple the influence of the different elements on the delay performance of the encoder/decoder by abstracting its processing capacity. To illustrate possible applications of this framework, the thesis presents examples of its use in design problems that affect multiview encoders and decoders. These cover the following cases: strategies for designing prediction structures that take delay requirements into account in addition to rate-distortion performance; dimensioning the number of processors and analyzing processor speed requirements in multiview encoders/decoders given a target delay; and a comparative analysis of the encoding delay of multiview encoders with different processing capabilities and hardware implementations.

    Congestion control for cloud gaming over UDP based on round-trip video latency

    © 2019 IEEE. We describe a network congestion control mechanism for cloud gaming (CG) platforms based on the user datagram protocol (UDP). To minimize the contribution of the downstream transmission delay to the total end-to-end latency in the interaction-perception loop, we first define the round-trip video latency (RTVL) and develop a congestion model. Based on them, we design and implement an adaptation strategy that detects the early stages of congestion to prevent high values of RTVL and network bufferbloat, thus avoiding packet losses. Using data measured from the network, our strategy modifies the target output bitrate of the video encoder to throttle the data flow sent by the server to the client down or up. In the presence of sudden downstream channel capacity drops of over 40%, our algorithm reactively satisfies the key CG requirements for interactive games by entirely avoiding packet losses and keeping the RTVL below 100 ms. In reasonably stable network conditions, our algorithm proactively keeps exploring higher bitrates and building a "network state dictionary", thanks to which it achieves an effective downstream channel capacity use of 95%. This work was supported in part by the Ministerio de Ciencia, Innovación y Universidades (AEI/FEDER) of the Spanish Government through the Project "Open Graphics Gaming Cloud" under Grant RTC-2016-5676-7 and the Project "Immersive Visual Media Environments" under Grant TEC2016-7598
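
The adaptation idea in this abstract, lowering the encoder's target bitrate when the measured RTVL signals early congestion and probing upward otherwise, can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the class name `BitrateController`, the early-warning threshold, the step sizes, and the bitrate bounds are all hypothetical, and the congestion model and "network state dictionary" described in the abstract are not reproduced.

```python
from dataclasses import dataclass


@dataclass
class BitrateController:
    """Toy latency-driven bitrate adaptation (all parameters are assumptions)."""

    bitrate_kbps: float                  # current encoder target bitrate
    min_kbps: float = 500.0              # hypothetical lower bound
    max_kbps: float = 20000.0            # hypothetical upper bound
    early_rtvl_ms: float = 70.0          # hypothetical early-congestion threshold,
                                         # chosen below the 100 ms RTVL target
    decrease_factor: float = 0.75        # multiplicative throttle-down
    increase_step_kbps: float = 250.0    # additive probe for more capacity

    def update(self, rtvl_ms: float) -> float:
        """Return the new target bitrate given one RTVL measurement."""
        if rtvl_ms > self.early_rtvl_ms:
            # Latency is building up: back off before packets are lost.
            self.bitrate_kbps *= self.decrease_factor
        else:
            # Network looks stable: keep exploring higher bitrates.
            self.bitrate_kbps += self.increase_step_kbps
        # Keep the target inside the allowed encoder range.
        self.bitrate_kbps = min(self.max_kbps, max(self.min_kbps, self.bitrate_kbps))
        return self.bitrate_kbps
```

Calling `update` once per RTVL measurement yields the next encoder target. The multiplicative decrease and additive increase are a common control pattern used here only for illustration.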

    An optimal yet fast pruning algorithm to reduce latency in multiview prediction structures

    We propose a new algorithm for the design of prediction structures with low delay and a limited penalty in rate-distortion performance for multiview video coding schemes. This algorithm is one element of a graph-theory-based framework for the analysis and optimization of delay in multiview coding schemes. The objective of the algorithm is to find the best combination of prediction dependencies to prune from a multiview prediction structure, given a number of cuts. By exploiting the properties of the graph-based analysis of the encoding delay, the algorithm finds the best prediction dependencies to eliminate from an original prediction structure while limiting the number of cut combinations to evaluate. We show that this algorithm obtains optimum results in the reduction of the encoding latency with a lower computational complexity than exhaustive search alternatives.
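
For orientation, the exhaustive search that this abstract uses as its point of comparison can be sketched in a few lines of Python. This is the brute-force baseline, not the proposed pruning algorithm. It rests on simplifying assumptions that are not taken from the paper: latency is modelled as the longest path through the dependency graph (unlimited parallel processors, fixed per-frame processing times), and the rate-distortion penalty of each cut is ignored. The function names are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from itertools import combinations


def latency(proc_time, edges):
    """Longest-path completion time of a prediction structure.

    proc_time: dict frame -> processing time
    edges: iterable of (reference_frame, dependent_frame) pairs
    """
    preds = {frame: set() for frame in proc_time}
    for ref, frame in edges:
        preds[frame].add(ref)
    done = {}
    # static_order() yields each frame after all of its reference frames.
    for frame in TopologicalSorter(preds).static_order():
        start = max((done[p] for p in preds[frame]), default=0.0)
        done[frame] = start + proc_time[frame]
    return max(done.values())


def best_cuts_exhaustive(proc_time, edges, num_cuts):
    """Try every combination of num_cuts dependencies; return (latency, cut)."""
    edges = list(edges)
    best = None
    for cut in combinations(edges, num_cuts):
        kept = [e for e in edges if e not in cut]
        value = latency(proc_time, kept)
        if best is None or value < best[0]:
            best = (value, cut)
    return best
```

The number of combinations grows combinatorially with the number of dependencies and cuts, which is the cost the abstract says the proposed algorithm avoids.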

    Systematic analysis of the decoding delay on MVC decoders

    We present a framework for the analysis of the decoding delay and communication latency in Multiview Video Coding (MVC). Applying this framework to MVC decoders makes it possible to minimize the overall delay in immersive video-conference systems.

    Subjective assessment of super multiview video with coding artifacts

    The subjective assessment of super multiview (SMV) video considers two main perceptual factors: image quality and visual comfort at the viewpoint transition. While previous works only covered raw content with high levels of visual comfort, this work extends them by targeting the subjective assessment of SMV content with coding artifacts. The analysis yields important conclusions regarding the relationship between these two factors, indicating that 1) the perceived image quality is independent of the viewpoint change speed, and 2) the perceived visual comfort at the viewpoint transition is independent of the image quality. These conclusions make it easier to extend existing subjective perception models, designed for raw SMV content, to coded content.

    Systematic analysis of the decoding delay in multiview video

    We present a framework for the analysis of the decoding delay in multiview video coding (MVC). We show that in real-time applications, an accurate estimation of the decoding delay is essential to achieve a minimum communication latency. Unlike in single-view codecs, the complexity of the multiview prediction structure and the parallel decoding of several views require a systematic analysis of this decoding delay, which we carry out using graph theory and a model of the decoder hardware architecture. Our framework assumes a decoder implementation on general-purpose multi-core processors with multi-threading capabilities. For this hardware model, we show that frame processing times depend on the computational load of the decoder, and we provide an iterative algorithm to jointly compute frame processing times and decoding delay. Finally, we show that decoding delay analysis can be applied to design decoders with the objective of minimizing the communication latency of the MVC system.
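
The graph-based part of such an analysis can be illustrated with a short Python sketch that treats frames as nodes of a directed acyclic graph and prediction dependencies as edges. It is a deliberately reduced model, not the framework from the paper: it assumes unlimited parallel decoding and fixed frame processing times, whereas the abstract describes a multi-core model in which processing times depend on the decoder load and are computed iteratively. The function name and the frame labels are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def completion_times(proc_time, deps, arrival=None):
    """Earliest time at which each frame finishes decoding.

    proc_time: dict frame -> processing time (assumed fixed)
    deps:      dict frame -> list of reference frames it is predicted from
    arrival:   optional dict frame -> time the coded frame becomes available
    """
    arrival = arrival or {}
    graph = {frame: deps.get(frame, ()) for frame in proc_time}
    done = {}
    # Visit frames so that every reference frame is handled first.
    for frame in TopologicalSorter(graph).static_order():
        # A frame can start once it has arrived and all references are decoded.
        ready = max([arrival.get(frame, 0.0)] + [done[r] for r in graph[frame]])
        done[frame] = ready + proc_time[frame]
    return done


# Two views, two time instants: view 1 is predicted from view 0.
proc = {"v0t0": 10.0, "v0t1": 10.0, "v1t0": 10.0, "v1t1": 10.0}
deps = {"v0t1": ["v0t0"], "v1t0": ["v0t0"], "v1t1": ["v1t0", "v0t1"]}
done = completion_times(proc, deps)
```

Under these assumptions the delay of a frame is its completion time minus its arrival time, and the longest dependency chain determines the worst case.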

    Analysis of pixel-mapping rounding on geometric distortion as a prediction for view synthesis distortion

    We analyze the performance of the geometric distortion incurred when coding depth maps in 3D video as an estimator of the distortion of synthesized views. Our analysis is motivated by the need to reduce the computational complexity of computing the synthesis distortion in 3D video encoders. We propose several geometric distortion models that capture (i) the geometric distortion caused by the depth coding error, and (ii) the pixel-mapping precision in view synthesis. Our analysis starts by evaluating the correlation between the geometric distortion values obtained with these models and the actual distortion of the synthesized views. Then, the different geometric distortion models are employed in the rate-distortion optimization cycle of depth map coding, in order to assess the results obtained by the correlation analysis. Results show that one of the geometric distortion models performs consistently better than the others in all tests. Therefore, it can be used as a reasonable estimator of the synthesis distortion in low-complexity depth encoders.

    ViCoCoS-3D: Videoconferencing common scenes

    This paper presents a 3D video dataset containing sequences with typical content from videoconferencing scenarios. The objective of this dataset is to provide freely available sequences for the research community to support the development and evaluation of processing techniques applicable to 3D videoconferencing systems. To that end, a detailed description of the generation process and the content characteristics is provided, together with insights into possible applications of the dataset.

    Fusion of pose and head tracking data for immersive mixed-reality application development

    This work addresses the creation of a development framework in which application developers can create, in a natural way, immersive physical activities where users experience a 3D first-person perception of full-body control. The proposed framework is based on commercial motion sensors and a Head-Mounted Display (HMD), and uses Unity 3D as a unifying environment where user pose, virtual scene and immersive visualization functions are coordinated. Our proposal is exemplified by the development of a toy application showing its practical use.